Repetition

In this section, we will learn about source separation approaches that exploit a common feature of musical signals: repetition. In doing so, we will gain some understanding of the mechanics of source separation and of how an algorithm can use assumptions about a signal to separate its sources.

Specifically, we will explore three algorithms that attempt to separate a repeating background from a non-repeating foreground. The basic assumptions here are 1) that there is repetition in the mixture, and 2) that the repetition captures what we want to separate. These assumptions hold quite well if we want to separate a singer from a backing band, but might not work if we want to isolate a drum set from the rest of the band, because the drum set is usually playing a repeating pattern.

REPET

The first algorithm we will explore here is called the REpeating Pattern Extraction Technique, or REPET [RP12]. REPET works like this:

  1. Find a repeating period, \(t_r\) seconds (e.g., the number of seconds after which a chord progression might start over).

  2. Segment the spectrogram into \(N\) segments, each \(t_r\) seconds long.

  3. “Overlay” those \(N\) segments.

  4. Take the median of those \(N\) stacked segments and make a mask of the median values.
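The four steps above can be sketched in a few lines of NumPy. This is only an illustration, not nussl's implementation: it assumes the repeating period is already known and is given in spectrogram frames rather than seconds, and the name `repet_mask` and the toy data are made up for this example.

```python
import numpy as np

def repet_mask(mag, period):
    """Soft mask for the repeating part of a magnitude spectrogram.

    mag:    non-negative array of shape (n_freq, n_frames)
    period: repeating period in frames (assumed to be known already)
    """
    n_freq, n_frames = mag.shape
    n_segments = int(np.ceil(n_frames / period))
    # Pad with NaN so a partial last segment is ignored by nanmedian.
    padded = np.full((n_freq, n_segments * period), np.nan)
    padded[:, :n_frames] = mag
    # Steps 2 and 3: cut into segments and overlay them.
    segments = padded.reshape(n_freq, n_segments, period)
    # Step 4: the median across segments models one repetition.
    model = np.nanmedian(segments, axis=1)
    repeating = np.tile(model, (1, n_segments))[:, :n_frames]
    # The repeating part can never be louder than the mixture itself.
    repeating = np.minimum(repeating, mag)
    return repeating / np.maximum(mag, 1e-12)

# Toy mixture: a pattern repeated 5 times plus one non-repeating event.
rng = np.random.default_rng(0)
pattern = rng.uniform(0.5, 1.0, size=(6, 4))
background = np.tile(pattern, (1, 5))
mix = background.copy()
mix[2, 9] += 3.0
mask = repet_mask(mix, period=4)
```

Multiplying `mask` by `mix` recovers the repeating background in this toy case, while the mask value at the non-repeating event is small. The median is what makes this work: an event that appears in only one of the stacked segments does not move it.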

We’ll use REPET to demonstrate how to run a source separation algorithm in nussl.

%%capture
!pip install git+https://github.com/source-separation/tutorial
# Do our imports
import nussl
import matplotlib.pyplot as plt
from common import viz

Let’s download an audio file that has a lot of repetition in it, and inspect and listen to it:

audio_path = nussl.efz_utils.download_audio_file('historyrepeating_7olLrex.wav', verbose=False)
history = nussl.AudioSignal(audio_path)
history.embed_audio()

plt.figure(figsize=(10, 3))
nussl.utils.visualize_spectrogram(history)
plt.title(str(history))
plt.tight_layout()
plt.show()

Now we need to instantiate a Repet object in nussl. We can do that like so:

repet = nussl.separation.primitive.Repet(history)

Now that the repet object has our AudioSignal, it’s easy to run the algorithm:

repet.run()
[<nussl.core.masks.soft_mask.SoftMask at 0x7f63f8c35410>,
 <nussl.core.masks.soft_mask.SoftMask at 0x7f63fa597550>]

Oh, look! The repet object returned masks! We can get audio signals back by doing the following:

r_estimates = repet.make_audio_signals()

We can also chain both of those operations if we don’t care about the intermediate steps:

r_estimates = repet()

Let’s check out the estimates that repet made:

viz.show_sources(r_estimates)

And there are our foreground and background sources!

Making it Interactive

nussl has hooks for gradio, so we can make our repet object interactive. All algorithms in nussl have this ability.

repet.interact()
Note that interact() needs the gradio package. If gradio is missing, nussl raises an ImportError asking you to install it with pip install gradio.

Go ahead and play around with REPET. See what types of audio work and what types don’t. How does it work on electronic loops? How does it work on ambient music?

Review

The process of running a separation algorithm in nussl was only a few steps:

  1. Instantiate a separation object with an audio signal. E.g., repet = nussl.separation.primitive.Repet(history)

  2. Run the object to get the results. E.g., repet()

Now let’s look at a few other algorithms that leverage repetition in a musical recording and compare results to REPET.

REPET-SIM

REPET-SIM is a variant of REPET that doesn’t rely on a fixed repeating period. In fact, it doesn’t rely on repetition as explicitly as REPET does. REPET-SIM calculates a similarity matrix between each pair of spectral frames in an STFT, selects the \(k\) nearest neighbors for each frame, and makes a mask by median filtering the bins across each frame’s selected neighbors.
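As a rough sketch of that idea (not nussl's implementation), the snippet below uses cosine similarity between frames and a plain per-bin median over the \(k\) most similar frames. The function name `repet_sim_mask`, the choice of cosine similarity, and the toy data are assumptions made for illustration.

```python
import numpy as np

def repet_sim_mask(mag, k=5):
    """Soft mask for the repeating part, using frame similarity.

    mag: non-negative array of shape (n_freq, n_frames)
    k:   number of most similar frames to take the median over
    """
    n_freq, n_frames = mag.shape
    # Cosine similarity between every pair of frames (columns).
    norms = np.linalg.norm(mag, axis=0, keepdims=True)
    unit = mag / np.maximum(norms, 1e-12)
    similarity = unit.T @ unit                      # (n_frames, n_frames)
    # Indices of the k most similar frames for each frame
    # (a frame counts as one of its own neighbors).
    k = min(k, n_frames)
    neighbors = np.argsort(-similarity, axis=1)[:, :k]
    # Median over the neighbors, separately for every frequency bin.
    repeating = np.median(mag[:, neighbors], axis=2)
    repeating = np.minimum(repeating, mag)
    return repeating / np.maximum(mag, 1e-12)

# Toy mixture: two frame types in an irregular order (no fixed period)
# plus one non-repeating event in frame 3.
frame_a = np.array([1.0, 0.8, 0.1, 0.1, 0.1, 0.1])
frame_b = np.array([0.1, 0.1, 0.1, 0.9, 1.0, 0.1])
order = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
background = np.stack([frame_a if i == 0 else frame_b for i in order], axis=1)
mix = background.copy()
mix[5, 3] += 2.0
sim_mask = repet_sim_mask(mix, k=5)
```

Because the neighbors are chosen by similarity rather than by position, the background is still recovered here even though the two frame types never settle into a fixed period.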

We can run REPET-SIM the same way we can run REPET:

repet_sim = nussl.separation.primitive.RepetSim(history)
rs_estimates = repet_sim()

viz.show_sources(rs_estimates)

And let’s make an interactive one as well:

repet_sim.interact()

2DFT

We can also use a Two-dimensional Fourier Transform (2DFT) of a spectrogram to find repeating and non-repeating patterns. Repeating sections show up as peaks in the 2DFT and non-repeating parts are everything else. We can use a peak picker to separate the repeating from the non-repeating parts. That’s what this algorithm does:

# We can't start a variable name with a number,
# so this object is called FT2D
ft2d = nussl.separation.primitive.FT2D(history)
ft2d_estimates = ft2d()
viz.show_sources(ft2d_estimates)
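To see why peaks in the 2DFT correspond to repetition, here is a small NumPy sketch on synthetic data. It is deliberately simplified and is not what FT2D does internally: a crude global threshold (a multiple of the median 2DFT magnitude) stands in for a real peak picker, and the function name `split_by_2dft_peaks` and the `ratio` parameter are hypothetical.

```python
import numpy as np

def split_by_2dft_peaks(mag, ratio=10.0):
    """Split a magnitude spectrogram into repeating and non-repeating parts.

    Bins of the 2DFT whose magnitude exceeds `ratio` times the median
    magnitude are treated as peaks (a stand-in for a real peak picker).
    """
    spectrum = np.fft.fft2(mag)
    strength = np.abs(spectrum)
    peaks = strength > ratio * np.median(strength)
    # Keeping only the peaks and inverting gives the repeating part.
    repeating = np.real(np.fft.ifft2(spectrum * peaks))
    non_repeating = mag - repeating
    return repeating, non_repeating

# Toy mixture: a periodic texture plus a single non-repeating event.
freq = np.arange(8)[:, None]
time = np.arange(32)[None, :]
background = 2.0 + np.cos(2 * np.pi * (freq / 8 + time / 4))
mix = background.copy()
mix[3, 10] += 1.0
repeating, non_repeating = split_by_2dft_peaks(mix)
```

The periodic texture concentrates its energy in a handful of 2DFT bins, while the single event spreads thinly over all of them, so thresholding separates the two. A full implementation would then turn the two estimates into a mask for the mixture.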

And let’s make 2DFT interactive too:

ft2d.interact()

Harmonic-Percussive Source Separation (HPSS)

If you spend enough time visualizing musical signals on a spectrogram, you start to notice that harmonic sounds look similar to horizontal stripes on a spectrogram and percussive sounds look similar to vertical stripes. Harmonic-Percussive Source Separation takes advantage of this insight by applying a median filter across time frames (horizontal, or harmonic) and across frequency bins (vertical, or percussive) to make a mask:

hpss = nussl.separation.primitive.HPSS(history)
# hpss gives harmonic then percussive,
# so let's reverse the order of the list
hpss_estimates = hpss()[::-1]
viz.show_sources(hpss_estimates)
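The median-filtering idea can be sketched with NumPy alone. This is an illustration under simplifying assumptions, not nussl's HPSS: filtering along the time axis keeps horizontal (harmonic) stripes, filtering along the frequency axis keeps vertical (percussive) stripes, and the two results are turned into soft masks by a simple ratio. The helper names and the filter length of 5 are arbitrary choices, and `sliding_window_view` needs NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_1d(mag, size, axis):
    """Median filter of odd length `size` along one axis (reflect-padded)."""
    pad = size // 2
    pad_width = [(0, 0), (0, 0)]
    pad_width[axis] = (pad, pad)
    padded = np.pad(mag, pad_width, mode="reflect")
    windows = sliding_window_view(padded, size, axis=axis)
    return np.median(windows, axis=-1)

def hpss_masks(mag, size=5):
    """Soft harmonic and percussive masks for a (n_freq, n_frames) array."""
    # Smoothing along time suppresses short vertical events.
    harmonic = median_filter_1d(mag, size, axis=1)
    # Smoothing along frequency suppresses thin horizontal lines.
    percussive = median_filter_1d(mag, size, axis=0)
    total = np.maximum(harmonic + percussive, 1e-12)
    return harmonic / total, percussive / total

# Toy spectrogram: one sustained tone (row 4) and one drum hit (column 9).
toy = np.zeros((16, 16))
toy[4, :] = 1.0
toy[:, 9] = 1.0
harmonic_mask, percussive_mask = hpss_masks(toy)
```

In this toy case the sustained tone ends up entirely in the harmonic mask and the drum hit entirely in the percussive mask, except at the one bin where they cross, which is shared equally between the two.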

Next Steps…

There you have it: four simple algorithms, three that separate repeating from non-repeating parts and one that separates harmonic from percussive parts.

Next we’ll talk about how we can model timbre using Non-negative Matrix Factorization.